Bacteriophages encode auxiliary metabolic genes that support more efficient phage replication. For
example, cyanophages carry several genes to maintain host photosynthesis throughout infection,
shuttling the energy and reducing power generated away from carbon fixation and into anabolic
pathways. Photodamage to the D1/D2 proteins at the core of photosystem II necessitates their
continual replacement. Synthesis of functional proteins in bacteria requires co-translational removal
of the N-terminal formyl group by a peptide deformylase (PDF). Analysis of marine metagenomes to
identify phage-encoded homologs of known metabolic genes found that marine phages carry PDF
genes, suggesting that their expression during infection might benefit phage replication. We
identified a PDF homolog in the genome of Synechococcus cyanophage S-SSM7. Sequence analysis
confirmed that it possesses the three absolutely conserved motifs that form the active site in PDF
metalloproteases. Phylogenetic analysis placed it within the Type 1B subclass, most closely related
to the Arabidopsis chloroplast PDF, but lacking the C-terminal α-helix characteristic of that group.
PDF proteins from this phage and from Synechococcus elongatus were expressed and
characterized. The phage PDF is the more active enzyme and deformylates the N-terminal
tetrapeptides from D1 proteins more efficiently than those from ribosomal proteins. Solution of
the X-ray crystal structures of these two PDFs to 1.95 Å resolution revealed active sites identical to
that of the Type 1B Arabidopsis chloroplast PDF. Taken together, these findings show that many
cyanophages encode a PDF with a D1 substrate preference that adds to the repertoire of genes used
by phages to maintain photosynthetic activities.
Introduction

Bacteriophages (phages) frequently carry metabolic genes acquired from host genomes. Not being
essential for replication per se, these are termed auxiliary metabolic genes (AMGs) (Breitbart
et al., 2007). The list of AMGs identified in phage genomes and thought to benefit the phage
during lytic replication now includes those functioning in photosynthesis (Mann et al., 2003;
Sharon et al., 2009; Sharon et al., 2011), the pentose phosphate pathway (Thompson et al., 2011),
phosphate acquisition (Goldsmith et al., 2011; Zeng and Chisholm, 2012), nucleotide metabolism
(Rohwer et al., 2000; Mann et al., 2005; Sullivan et al., 2005; Dinsdale et al., 2008) and
cytoskeletal construction (Kraemer et al., 2012), among others.

Marine cyanophages carry a rich array of AMGs, including genes encoding components of
photosystem II (PSII). PSII is subject to photodamage, particularly the D1 and D2 proteins that
compose the functional heterodimer at its core and that are responsible for the binding of
pigments and cofactors necessary for primary photochemistry (Millard et al., 2004). To maintain
sufficient active D1/D2, the proteins are protected by high-light-inducible proteins that assist
by dissipating excess light energy, and are also regularly recycled and replaced (Millard et al.,
2004).

During a cyanophage infection, continuing photosynthesis is required to provide both energy (ATP)
and reducing power (NADPH) for phage replication (Mann et al., 2003; Lindell et al., 2005).
Numerous cyanophages encode D1 proteins plus at least one other component of PSII (Lindell
et al., 2005). During infection of Prochlorococcus MED4 by the T7-like cyanophage P-SSP7, phage
D1 protein is expressed and partially compensates for the decline in host D1 synthesis (Lindell
et al., 2005). Other factors likely contribute as well to the maintenance of photosynthetic
activity during infection, for example, phage-encoded high-light-inducible proteins. On the
other hand, following infection of Synechococcus sp. WH7803 by S-PM2 phage, host D1 (psbA) mRNA
initially increases 18-fold and remains slightly elevated even late in infection, while a high
level of phage D1 transcripts was sustained throughout (Clokie et al., 2006). Clearly D1
synthesis, either from phage or host genes, is a priority during cyanophage infection.

During protein translation in bacteria, the first 
amino-acid residue inserted is N-formylmethionine. 
The formyl moiety is then co-translationally 
removed from at least 98% of bacterial proteins by 
a peptide deformylase (PDF) (personal communica- 
tion; Sharon et al., 2011; reviewed in Meinnel and 
Giglione, 2008). Frequently the N-terminal methio- 
nine is also removed by methionine aminopeptidase. 
Both modifications take place immediately as the 
nascent polypeptide chain emerges from the portal 
of the ribosome exit tunnel, before protein folding 
blocks enzyme access to the N-terminus. In E. coli, 
deletion of the PDF gene is lethal (Mazel et al., 1994). 

Inhibition of PDFs by actinonin, a naturally 
occurring antimicrobial, inhibits bacterial growth 
(Chen et al., 2000), and actinonin treatment in vivo 
also leads to decline of photosynthetic function of 
the plastids in plants and green algae. In the 
unicellular alga Chlamydomonas, actinonin desta- 
bilizes D2, shunting it to a degradative pathway, 
thus interfering with the assembly of PSII (Giglione 
et al., 2003). In vascular plants, where the translation
rate of D1 is 50-100 times greater than those of
other PSII proteins (Kyle et al., 1984), the effect of
actinonin on D1 synthesis and assembly is most
pronounced (Hou et al., 2004). Therefore, the
chloroplast PDF is essential for maintenance of PSII. 
If this N-terminal methionine excision system is 
overwhelmed, as might be expected by the high 
level of protein synthesis during phage replication, 
improperly processed and non-functional proteins 
would result. Cyanobacteria, important aquatic 
primary producers, share high functional similarity 
to chloroplasts (Giovannoni et al., 1988), but it is not 
known if their PDF shares functional or structural 
similarity to the PDF of chloroplasts. 

PDFs are a subclass of the metalloprotease superfamily of enzymes known as the 'clan MA and MB'
metalloproteases. Proteins from this family share a common structure containing a three-stranded
β strand facing a catalytic metal and a HEXXH motif-containing α helix (Giglione et al., 2004).
Although PDFs display variability in amino-acid sequence and overall length, they share three
absolutely conserved and unique motifs that together form the entire active site: the G-motif
(GΦGΦAAXQ, where Φ is any hydrophobic amino acid and X is any amino acid), the H-motif
(QHEXDHLXG) and the C-motif (EGCXS) (Giglione et al., 2004). PDFs are classified into three
types based on sequence homologies and structural distinctions. Type 1 PDFs (PDF1) are found in
bacteria and bacterial-derived organelles of eukaryotes (plastids, mitochondria and apicoplasts),
while Type 2 PDFs are restricted to the Gram-positive Bacteria. These PDFs all possess the three
conserved motifs and all act on the same substrates (N-formylmethionine polypeptides), although
some variation in substrate specificity has been found in that the high-turnover D1 protein from
PSII is a preferred substrate for the PDF in plant plastids (Dirk et al., 2008). These active
PDFs differ in the secondary structure of their C-terminal domain; it is an α helix in Type 1
proteins and a supplementary β strand in Type 2 proteins. Within the PDF1, the C-terminal
α-helix displays variable length and is not required for catalytic activity (Meinnel et al.,
1996).
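
The three motifs described above lend themselves to a simple pattern search. The following is a minimal sketch, not the procedure used in this study: it assumes an illustrative set of hydrophobic residues for Φ (the text does not enumerate them), and the function names and the toy sequence are hypothetical.

```python
import re

# Assumption: an illustrative hydrophobic residue set for the "Φ" positions.
# The text only says "any hydrophobic amino acid" and lists no residues.
HYDROPHOBIC = "AVLIMFWC"

# Motifs as given in the text; "X" (any amino acid) becomes "." in the regex.
MOTIFS = {
    "G": re.compile(f"G[{HYDROPHOBIC}]G[{HYDROPHOBIC}]AA.Q"),  # GΦGΦAAXQ
    "H": re.compile("QHE.DHL.G"),                              # QHEXDHLXG
    "C": re.compile("EGC.S"),                                  # EGCXS
}


def find_pdf_motifs(protein_seq):
    """Return {motif name: 0-based start of first match, or None if absent}."""
    seq = protein_seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        match = pattern.search(seq)
        hits[name] = match.start() if match else None
    return hits


def has_all_motifs(protein_seq):
    """True only if the G, H and C motifs are all present."""
    return all(pos is not None for pos in find_pdf_motifs(protein_seq).values())


# Hypothetical toy sequence built only to exercise the patterns;
# it is not a real PDF sequence.
toy = "MKT" + "GIGLAAPQ" + "LLNP" + "EGCLS" + "VRAD" + "QHEMDHLQG" + "KK"
print(find_pdf_motifs(toy))
print(has_all_motifs(toy))
```

A search of this kind only flags candidate sequences; whether a candidate is catalytically active still has to be established experimentally, as is done below.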

The less-studied Type 3 PDFs, found in Archaea 
and in the mitochondria of trypanosomatids, differ 
in their 'conserved' motifs. Archaea do not use 
N-formylmethionine for translation initiation; the 
formylated substrate for the archaeal PDFs is 
currently unknown (Bouzaidi-Tiali et al., 2007). 

Evidence that phages encode AMGs has often 
come from the identification of metabolic homologs 
in sequenced phage genomes. Recently a novel 
method was used to search marine metagenomes 
for virally-encoded microbial metabolic genes 
(Sharon et al., 2011). By recognizing phage scaffolds 
in these marine metagenomes and then identifying 
metabolic genes within them, this approach is not 
limited to sequenced genomes. The most abundant 
metabolic genes found by this survey were PDFs, 
thereby predicting that PDFs are a significant part of 
the metabolic repertoire carried by marine phages. 

Bringing all these threads together, we hypothe- 
sized that encoding and expressing a PDF could be 
one of the strategies used by cyanophages to 
manipulate host metabolism during infection. Here, 
we report identification of a PDF gene in the genome 
of a marine cyanophage (S-SSM7) and solution of its 
crystal structure. The PDF encoded by cyanophage
S-SSM7 belongs to the Type 1B PDF subgroup,
which is widely distributed among marine phage.
We also determined the enzymatic activity and
structure of the previously annotated PDF protein
from Synechococcus elongatus PCC 6301. Comparison
of these enzymes demonstrates that the S-SSM7
enzyme is highly active with a substrate preference
for the high-turnover PSII protein D1, consistent
with our hypothesis that phage encode PDFs to
manipulate host metabolism and help sustain
photosynthesis during infection.



Materials and methods 

Target selection 

The gene for the Synechococcus phage PDF, YP_004324347, was identified in the genome of
Synechococcus phage S-SSM7 during a bioinformatic survey looking for unknown phage proteins
that are abundant in environmental metagenomes but under-represented in the PhAnToMe phage
genome database (http://www.phantome.org/). Protein sequences identified in phage genomes
using Glimmer V3.02 were compared against all publicly available metagenomes (from
http://edwards.sdsu.edu/cgi-bin/mymgdb/show.cgi) using tBLASTn (E value cutoff 10⁻⁵). ORFs were
annotated using the PhAnToMe database with BLASTp (E value cutoff 10⁻⁵). The encoded protein is
referred to throughout the text as the phage PDF.

The gene for the cyanobacterial PDF, YP_170923, in the genome sequence of S. elongatus PCC 6301,
was identified based on gene annotations in the NCBI Reference Sequence database
(http://www.ncbi.nlm.nih.gov/RefSeq/). (The original host strain for S-SSM7, Synechococcus
WH8109, has not been sequenced.) The encoded protein is referred to throughout the text as the
bacterial PDF.



Gene design 

Starting from the amino-acid sequences for these two PDF proteins, gene sequences were designed
for expression in E. coli using Gene Composer (Emerald BioSystems, Bainbridge Island, WA, USA)
(Lorimer et al., 2009; Lorimer et al., 2011). Back-translation of the amino-acid sequences
employed a Universal Codon Usage Table designed to accommodate expression in E. coli with a
minimum usage threshold of 2%. Restriction enzyme recognition sequences for BamHI and HindIII
were excluded from the sequence to facilitate cloning. The engineered gene sequences were
synthesized by DNA2.0 (Menlo Park, CA). The genes were sub-cloned into an expression vector
containing an isopropyl-D-thiogalactopyranoside-inducible (IPTG-inducible) T7 promoter using the
PIPE cloning method (Klock et al., 2008; Raymond et al., 2009; Raymond et al., 2011). The vector
provides an N-terminal hexahistidine-Smt tag; the Smt tag is specifically and efficiently removed
by Ulp1 protease (Mossessova and Lima, 2000).
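
The exclusion of BamHI and HindIII recognition sequences described above can be checked with a short script. This is a minimal sketch and not a description of how Gene Composer works internally; the function names and the test sequence are hypothetical. The recognition sequences GGATCC (BamHI) and AAGCTT (HindIII) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences for the two enzymes named in the text.
# Both are palindromic, so the reverse strand needs no separate scan.
RESTRICTION_SITES = {"BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(dna, sites=RESTRICTION_SITES):
    """Return {enzyme: [0-based start positions]} for every site occurrence."""
    dna = dna.upper()
    found = {}
    for enzyme, site in sites.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            # Step by one so overlapping occurrences are not missed.
            start = dna.find(site, start + 1)
        found[enzyme] = positions
    return found


def is_free_of_sites(dna):
    """True if the designed gene contains none of the excluded sites."""
    return not any(find_sites(dna).values())


# Hypothetical test sequence, not a real designed gene.
example = "ATGGGATCCAAAGCTTGGATCC"
print(find_sites(example))
print(is_free_of_sites("ATGGCTAAAAAATAA"))
```

A design tool would typically use such a check to trigger a synonymous codon swap wherever a site is found; that step is omitted here.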



Gene expression and protein purification
E. coli BL21(DE3) cells expressing the engineered PDF gene from Synechococcus phage S-SSM7 or
S. elongatus were cultured at 37 °C to an A600 of ~0.6 in TB medium (Teknova, Hollister, CA,
USA). Cultures were induced with 1 mM IPTG and incubated overnight at 25 °C. The bacterial cells
were harvested by centrifugation at 4 °C and the resulting cell paste was stored at −80 °C. For
protein purification, the cells were thawed in 25 mM Tris-HCl (pH 8.0), 200 mM NaCl, 50 mM
arginine, 10 mM imidazole, 0.02% CHAPS detergent, 0.5% glycerol, 1 mM
Tris(2-carboxyethyl)phosphine (TCEP), 100 mg lysozyme, 250 U µl⁻¹ Benzonase (Novagen, Madison,
WI, USA) and one complete EDTA-free Protease Inhibitor Cocktail tablet (Roche, Indianapolis, IN,
USA), and then lysed by sonication using a Misonix S-4000 sonicator (Qsonica, Newtown, CT, USA)
at 70% power (67-69 W), 2 s on/1 s off, 3 min total. Immediately after sonication, the crude
lysate was clarified by centrifugation at 18 000 g for 35 min at 4 °C. The tagged proteins were
initially purified using the Protein Maker (Smith et al., 2011). Briefly, lysates were applied
to a 5-ml HisTrap FF nickel-chelate column (GE Healthcare, Waukesha, WI, USA) in 25 mM Tris-HCl
(pH 8.0), 200 mM NaCl, 50 mM arginine, 10 mM imidazole, 0.25% glycerol and 1 mM TCEP. The column
was washed with three column volumes of the same buffer, then eluted in three steps: a 5-ml
elution with 30 mM, a 5-ml elution with 206 mM and two 5-ml elutions at 500 mM imidazole.
Elution fractions containing partially purified PDF proteins were identified based on molecular
weight by SDS polyacrylamide gel electrophoresis (SDS-PAGE). The fraction containing the target
protein was treated with 50 ml of 1 mg ml⁻¹ of 6×His-tagged ubiquitin-like protease 1 overnight
to cleave the Smt-PDF fusion and remove the His tag. The resulting protein was dialyzed against
1 l of 25 mM Tris-HCl (pH 8.0), 200 mM NaCl, 50 mM arginine, 10 mM imidazole, 0.25% glycerol and
1 mM TCEP, and then run over a second nickel-chelate column as described above. The
flow-through, wash and elution fractions were analyzed by SDS-PAGE and the fraction containing
PDF protein was concentrated using an Amicon 10 kDa MWCO concentrator (Millipore, Billerica, MA,
USA) to ~5 ml and further purified by size exclusion chromatography on a Sephacryl S-100 10/300
GL column (GE Healthcare) in 25 mM Tris-HCl (pH 8.0), 200 mM NaCl, 1.0% glycerol and 1 mM TCEP.
Peak fractions (3.0 ml) were concentrated to ~10 mg ml⁻¹ and flash frozen in liquid nitrogen in
100 µl aliquots. Selenomethionine-labeled proteins were produced in E. coli BL21(DE3) cells
grown in M9 minimal medium supplemented with 5 mg ml⁻¹ thiamine, 60 mg l⁻¹ selenomethionine and
other metabolites to inhibit methionine biosynthesis (Doublie, 1997), followed by purification
as described above.



Enzyme assays 

PDF enzyme activity was assayed using a coupled PDF/FDH (formate dehydrogenase) assay. The
formate released by PDF from the fMAS substrate is oxidized by FDH with the concomitant
reduction of NAD+ to NADH (Lazennec and Meinnel, 1997). NADH was measured by absorbance at
340 nm using a SpectraMax M2 96-well microplate reader (Molecular Devices, Sunnyvale, CA, USA).
The substrate was 1 mM fMAS. Initial rates of reaction (v_app) were calculated from the initial
linear portion of plots of NADH produced versus time. K_m and V_max were extrapolated from plots
of v_app versus substrate concentration (GraphPad Prism, La Jolla, CA, USA). k_cat was
calculated by dividing V_max by enzyme concentration. The inhibitor data were obtained
using 0.2 nM enzyme.
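
The kinetic constants described above follow from the Michaelis-Menten relation v = V_max[S]/(K_m + [S]) and from k_cat = V_max/[E]. The sketch below illustrates one way to estimate them. The text does not state which fitting procedure was used in GraphPad Prism, so the Hanes-Woolf linearization shown here ([S]/v = [S]/V_max + K_m/V_max) is a stand-in chosen for its simplicity; the function names and the synthetic data are hypothetical.

```python
def fit_michaelis_menten(substrate, rates):
    """Estimate (K_m, V_max) by least squares on the Hanes-Woolf form.

    Hanes-Woolf: S/v = (1/V_max) * S + K_m/V_max, i.e. a straight line in S.
    Assumption: this linearization is illustrative and is not claimed to be
    the fitting method used in the study.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                 # equals 1 / V_max
    intercept = mean_y - slope * mean_x  # equals K_m / V_max
    v_max = 1.0 / slope
    k_m = intercept * v_max
    return k_m, v_max


def turnover_number(v_max, enzyme_conc):
    """k_cat = V_max / [E]; both must be in matching concentration units."""
    return v_max / enzyme_conc


# Synthetic, noise-free data generated from assumed constants
# (K_m = 0.5 mM, V_max = 2.0 rate units), used only to show the round trip.
true_km, true_vmax = 0.5, 2.0
s_values = [0.1, 0.25, 0.5, 1.0, 1.5]
v_values = [true_vmax * s / (true_km + s) for s in s_values]

k_m, v_max = fit_michaelis_menten(s_values, v_values)
print(k_m, v_max)
```

With real, noisy initial-rate data a direct non-linear fit is generally preferred over any linearization, because linear transforms distort the error structure.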

N-terminal tetrapeptides

The N-terminal tetrapeptide sequences encoded by annotated ribosomal protein genes in
S. elongatus PCC 6301 (Genbank: AP008231.1), Synechococcus sp. CC9311 (Genbank: NC_008319.1),
Synechococcus sp. CC9605 (Genbank: NC_007516.1) and Synechococcus sp. CC9902 (Genbank:
NC_007513.1) were tabulated to yield a total of 259 proteins (64, 65, 66 and 64 from the
respective genomes). The most frequently occurring tetrapeptides consisted of one with six
occurrences (fMAKK) and 17 with four occurrences (including fMSRV and fMARI). The peptides
fMAKK, fMSRV and fMARI were used as substrates in the PDF specificity assays, compared against
the encoded N-terminal tetrapeptides of four representative D1 proteins: the D1 of Synechococcus
phage S-SSM7 (Genbank: GU071098.1) and three bacterial D1 proteins from S. elongatus PCC 6301
(Table 1).
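
The tabulation of N-terminal tetrapeptides described above amounts to a frequency count. The following minimal sketch shows the idea; the function names are hypothetical, the input sequences are invented placeholders rather than real ribosomal proteins, and the sketch assumes each annotated protein begins with methionine.

```python
from collections import Counter


def n_terminal_tetrapeptide(protein_seq):
    """Return the formylated N-terminal tetrapeptide label, e.g. 'fMAKK'.

    Assumption: sequences that are shorter than four residues or do not
    start with M are skipped (returned as None).
    """
    seq = protein_seq.upper()
    if len(seq) < 4 or not seq.startswith("M"):
        return None
    return "f" + seq[:4]


def tally_tetrapeptides(protein_seqs):
    """Count how often each N-terminal tetrapeptide occurs."""
    counts = Counter()
    for seq in protein_seqs:
        peptide = n_terminal_tetrapeptide(seq)
        if peptide is not None:
            counts[peptide] += 1
    return counts


# Invented placeholder sequences, used only to exercise the functions.
toy_proteins = ["MAKKTL", "MAKKVV", "MSRVAA", "MARIQQ", "MAKKLL", "MSR"]
counts = tally_tetrapeptides(toy_proteins)
print(counts.most_common(3))
```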



Phylogenetic analyses 

For phylogenetic analysis, in addition to the phage and bacterial PDFs, 49 representative
annotated PDF genes were selected from the sequenced genomes in The SEED database
(http://theseed.uchicago.edu/FIG/) (Overbeek et al., 2005). A multiple sequence alignment was
generated at the amino-acid level using ClustalW (Chenna et al., 2003) and the tree was drawn
using FigTree v1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/).

Table 1 Tetrapeptide substrates for PDF activity assays

Protein (Genbank)           Tetrapeptide  Source                            Number of occurrences

D1
  GI:310003945              fMLIS         Cyanophage S-SSM7
  GI:56686061               fMTSI         Synechococcus elongatus PCC 6301
  GI:56685134, GI:56685615  fMTTA         S. elongatus PCC 6301

Ribosomal proteins
  GI:56750900               fMAKK         S. elongatus PCC 6301             6
  GI:56751081                             S. elongatus PCC 6301
  GI:78213979                             S. sp. CC9605
  GI:78212121                             S. sp. CC9605
  GI:78185723                             S. sp. CC9902
  GI:78185359                             S. sp. CC9902
  GI:56751466               fMSRV         S. elongatus PCC 6301             4
  GI:113953218                            S. sp. CC9311
  GI:78212935                             S. sp. CC9605
  GI:78184654                             S. sp. CC9902
  GI:56751895               fMARI         S. elongatus PCC 6301             4
  GI:113953764                            S. sp. CC9311
  GI:78211907                             S. sp. CC9605
  GI:78185543                             S. sp. CC9902

Abbreviations: fM, formylmethionine; PDF, peptide deformylase.



Global PDF distribution 

The 70 phage-encoded PDFs reported by Sharon et al. (2011) were compared against all
12 672 581 sequencing reads in the Global Ocean Survey (GOS) metagenomes (Yooseph et al., 2007)
using BLASTX (Altschul et al., 1990) (parameters: -e 1e-5 -F F -r 1 -q -1 -v 5 -b 5). GOS
sequences with at least 30% amino-acid identity and 50% similarity to any of the 70 PDFs were
tallied as phage-encoded PDF homologs. The normalized relative abundance was calculated for each
sampled location as the number of phage-encoded PDF homologs divided by the total number of
sequencing reads in that data set. The map was generated using ArcGIS version 10.1. The mapped
chlorophyll concentrations are the September 2012 values from the NASA Earth Observations site
(http://neo.sci.gsfc.nasa.gov).
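
The tallying and normalization steps described above can be written out as follows. This is a minimal sketch under stated assumptions: it takes as input one (identity, similarity) pair per read, representing that read's best BLASTX hit against the 70 PDFs, so that no read is counted twice. The function names and the example values are hypothetical and do not reproduce any data from the study.

```python
def is_pdf_homolog(identity_pct, similarity_pct,
                   min_identity=30.0, min_similarity=50.0):
    """Apply the thresholds from the text: >=30% identity and >=50% similarity."""
    return identity_pct >= min_identity and similarity_pct >= min_similarity


def relative_abundance(best_hits, total_reads):
    """Homolog count divided by the total reads in one sampled data set.

    best_hits: iterable of (identity_pct, similarity_pct), one entry per read
               (assumption: each read contributes at most one hit).
    total_reads: total number of sequencing reads in that data set.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    n_homologs = sum(1 for ident, sim in best_hits if is_pdf_homolog(ident, sim))
    return n_homologs / total_reads


# Hypothetical hits for one sampling site: only the first and third pass.
site_hits = [(45.0, 60.0), (29.9, 70.0), (30.0, 50.0), (80.0, 49.0)]
print(relative_abundance(site_hits, total_reads=1000))
```

Dividing by the total read count makes sites with very different sequencing depths comparable on the same map.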



Crystallization

Crystallization trials of the phage PDF (native and Se-Met) were carried out using sitting drop
vapor diffusion at 16 °C in a drop of a 1:1 mixture of protein (10.17 mg ml⁻¹) and reservoir
solution. Crystals grew from 0.1 M HEPES, pH 7.5; 70% (v/v) (+/−)-2-methyl-2,4-pentanediol. For
co-crystallization with the inhibitor, actinonin, the native PDF protein was incubated at 4 °C
overnight with a five-fold molar excess of actinonin (Santa Cruz Biotechnology, Santa Cruz, CA,
USA) before crystallization. Crystals were frozen in liquid nitrogen in a solution of 30% (w/v)
ethylene glycol for cryoprotection.

Crystallization of the native bacterial PDF was achieved as described above except that the
crystals grew in 0.15 M KBr, 30% PEG-MME 2000.



Crystallography: phage protein 

For the apo Se-Met phage protein data set (PDB 3UWA), X-ray diffraction data were collected at
the Advanced Photon Source, beamline 19ID (Argonne, IL, USA) at a wavelength of 0.9791750 Å. The
structure was solved by Se-SAD (Supplementary Tables S2 and S3). Crystals display a symmetry of
P2₁2₁2₁ with cell dimensions a = 47.5 Å, b = 58.3 Å, c = 62.4 Å. For the actinonin-bound native
phage protein data set (PDB 3UWB), the data were obtained using a Rigaku SuperBright FR-E+
rotating-anode X-ray generator with Osmic VariMax HF optics and a Saturn 944+ CCD detector. Data
were collected to 1.7 Å resolution and the structure was solved by molecular replacement using
the phage structure, PDB 3UWA, as the search model (Supplementary Tables S2 and S3). In both
cases, a single crystal was used for each complete data set.



Crystallography: bacterial protein 
A data set for the apo native bacterial crystal (PDB 4DR8) was collected at the Stanford
Synchrotron Radiation Lightsource, beamline 7-1. Data were collected to 1.55 Å resolution and
the structure was solved by molecular replacement using PDB 1LRY as the search model
(Supplementary Tables S4 and S5). PDB 1LRY, the PDF from Pseudomonas aeruginosa, was the PDF
with a solved structure that had the highest similarity to the Synechococcus protein. Crystals
display P1 symmetry with cell dimensions of a = 43.40 Å, b = 65.36 Å, c = 69.03 Å. To generate
the actinonin-bound bacterial protein, the apo crystals were soaked overnight with actinonin
(1 mM in crystallization mother liquor). The actinonin-bound data set was collected in-house as
described for the phage data sets. Data were collected to 1.9 Å resolution and the structure
(PDB 4DR9) was solved by molecular replacement using the native structure (PDB 4DR8) as the
search model (Supplementary Tables S4 and S5).

Results 

Global distribution of phage PDF homologs 
The initial report of abundant phage-encoded PDFs in marine environments (Sharon et al., 2011)
did not address their biological function or the question of their global distribution.
Therefore, we first searched all available GOS metagenomes for homologs of those phage-encoded
PDFs using BLASTX. Mapping of the relative abundance of these homologs at all the GOS-sampling
locations makes their widespread distribution apparent (Figure 1, Supplementary Table S1), thus
demonstrating their importance in the environment. Additional studies will be required to
correlate their pattern of distribution with ecological factors such as host distribution and
nutrient levels; however, we have extensively characterized the structure and function of a
phage-encoded PDF.



Synechococcus phage PDF activity 
Polypeptide deformylation activity of the PDF protein from Synechococcus phage S-SSM7 (referred
to throughout as the 'phage PDF') was compared with the activity of the PDF from S. elongatus
PCC 6301 (referred to throughout as the 'bacterial PDF') using a real-time PDF/FDH
enzyme-coupled assay (see Methods). The assay was optimized to measure initial rates accurately
and produced Z′ values >0.8 (data not shown). This assay was first used to evaluate the
sensitivity of the phage PDF to inhibition by actinonin, a naturally occurring antibiotic
demonstrated to be a potent inhibitor of bacterial PDFs (Chen et al., 2000). Results in
Figure 2a show that actinonin is a potent inhibitor of phage and bacterial PDFs with an apparent
IC50 value of 10 nM against both enzymes, suggesting that both enzymes have similar catalytic
active sites.

The deformylation activity of the phage PDF was assayed using formylated N-terminal
tetrapeptides derived from D1 proteins from cyanophage S-SSM7 and S. elongatus PCC 6301 as well
as the most frequently occurring N-terminal tetrapeptides of ribosomal proteins from four
Synechococcus genomes (Table 1 and Methods). The phage PDF has significantly higher specific
activity than the bacterial enzyme on the substrates used above (average k_cat values of 615
versus 192 s⁻¹, Table 2). Likewise, the phage PDF is significantly more efficient than the
bacterial PDF at deformylating the D1-derived tetrapeptides (k_cat/K_m for the phage PDF being
14.7, 10.5 and 12.6 times that of the bacterial PDF; Table 2). Comparison of the substrate
kinetics of these two enzymes on tetrapeptides derived from phage and bacterial D1 proteins
shows the consistently greater activity of the phage PDF (Figure 2b). The differences in
efficiency were smaller when three different tetrapeptides from ribosomal proteins served as the
substrate (the phage efficiency here being only 5.1, 1.4 and 6.5 times that for the bacterial
enzyme). Taken together, these results clearly show that the phage PDF is a catalytically
active deformylase with overall properties similar to the bacterial PDF.

Figure 1 Global distribution of phage-encoded PDFs. GOS-sampling locations (blue dots) and
relative abundances of phage-encoded PDFs (white circles) in the GOS metagenomes are plotted on
a global map of marine chlorophyll concentrations. See Supplementary Table S1 for the data
plotted.

[Figure 2 panels: (a) Actinonin Inhibition, x-axis: actinonin concentration (log M); (b)
Substrate Kinetics, x-axis: peptide concentration (mM); series for the phage PDF and the
bacterial PDF on Phage_D1 fMLIS, Bact_D1 fMTSI and Bact_D1 fMTTA.]

Figure 2 (a) Inhibition of the Synechococcus phage S-SSM7 PDF (blue) and the PDF from
Synechococcus elongatus PCC 6301 (red) by actinonin. Enzyme: 0.2 nM PDF. Substrate: 1.5 mM
fMLIS, the N-terminal tetrapeptide of the D1 protein encoded by the phage. (b) Kinetic assays
comparing the activity of the Synechococcus phage S-SSM7 PDF (blue) with that from S. elongatus
PCC 6301 (red) on three different N-terminal tetrapeptide substrates derived from phage or
bacterial D1 proteins (see Table 1). Red square = phage D1, fMLIS; solid red circle = bacterial
D1, fMTSI; open red circle = bacterial D1, fMTTA.

Table 2 Activity of phage and bacterial PDFs on N-terminal tetrapeptides derived from D1
proteins and cyanobacterial ribosomal proteins. (See Table 1 for substrate details.)

Substrate                   Enzyme      K_m (mM)  k_cat (s⁻¹)  k_cat/K_m

Phage D1: fMLIS             Phage       0.31      683          2204
                            Bacterial   1         150          150
                            Phage/Bact  0.3       4.6          14.7

Bacterial D1: fMTSI         Phage       0.48      767          1597
                            Bacterial   1.2       183          153
                            Phage/Bact  0.4       4.2          10.5

Bacterial D1: fMTTA         Phage       0.52      800          1538
                            Bacterial   1.5       183          122
                            Phage/Bact  0.3       4.4          12.6

Ribosomal proteins: fMAKK   Phage       1.3       583          449
                            Bacterial   1.9       167          88
                            Phage/Bact  0.7       3.5          5.1

Ribosomal proteins: fMARI   Phage       0.2       87           433
                            Bacterial   0.8       250          313
                            Phage/Bact  0.3       0.3          1.4

Ribosomal proteins: fMSRV   Phage       0.6       767          1278
                            Bacterial   1.1       217          197
                            Phage/Bact  0.5       3.5          6.5



Phage PDF specificity 

The two substrate classes (D1 and ribosomal proteins) were chosen to determine if either PDF
displayed any selectivity for deformylating D1 proteins. Results in Table 2 show that the
specificity constants (k_cat/K_m) (Eisenthal et al., 2007) for the phage PDF are significantly
higher for the D1-derived peptides compared with the ribosomal-derived peptides (1780 versus
720). In contrast, when comparing the two substrates against the bacterial PDF, the opposite
trend was observed, although the difference was not nearly as significant (141 versus 199).
These results show that the phage PDF is much more efficient at deformylating the D1
substrates than the bacterial PDF.
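
The class averages quoted above can be reproduced from the k_cat/K_m column of Table 2. The following minimal sketch does this arithmetic; the dictionary layout and function names are hypothetical, while the numbers are copied from Table 2. The bacterial D1 mean evaluates to about 141.7, which matches the quoted 141 if that figure was truncated rather than rounded.

```python
from statistics import mean

# k_cat/K_m values copied from Table 2, grouped by enzyme and substrate class.
# D1 order: fMLIS, fMTSI, fMTTA. Ribosomal order: fMAKK, fMARI, fMSRV.
SPECIFICITY = {
    "phage": {"D1": [2204, 1597, 1538], "ribosomal": [449, 433, 1278]},
    "bacterial": {"D1": [150, 153, 122], "ribosomal": [88, 313, 197]},
}


def class_means(table):
    """Mean specificity constant per enzyme and substrate class."""
    return {
        enzyme: {cls: mean(values) for cls, values in classes.items()}
        for enzyme, classes in table.items()
    }


def d1_preference(table):
    """Ratio of mean D1 to mean ribosomal k_cat/K_m; >1 means D1 is preferred."""
    means = class_means(table)
    return {enzyme: m["D1"] / m["ribosomal"] for enzyme, m in means.items()}


means = class_means(SPECIFICITY)
preference = d1_preference(SPECIFICITY)
for enzyme in SPECIFICITY:
    print(enzyme, means[enzyme], round(preference[enzyme], 2))
```

A ratio above 1 for the phage enzyme and below 1 for the bacterial enzyme is the pattern the paragraph describes.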



Phylogenetic analysis of cyanophage S-SSM7 PDF 
Members of the PDF family, while displaying variability in amino-acid sequence, share three
essential conserved motifs (the G, H and C motifs) that together build the entire active site
(Giglione et al., 2004). Alignment of the predicted phage PDF sequence with PDFs of known
structure from eukaryote organelles (mitochondria and chloroplasts of Arabidopsis thaliana, and
the non-photosynthetic plastid-derived apicoplast of Plasmodium falciparum) and a cyanobacterium
(S. elongatus PCC 6301) demonstrates the conservation of these motifs (Figure 3a). A
phylogenetic tree built using 51 PDF sequences from bacteria, phage and eukaryote organelles
clusters the phage PDF closely with the Type 1B PDF from A. thaliana chloroplasts (Figure 3b).
Notably, unlike most bacterial PDF1B enzymes, the C-terminal alpha helix characteristic of this
subtype is absent from this phage PDF.



Enzyme structure 

Comparison of the 1.95 Å structure of the phage PDF (Figure 4a) with that of the A. thaliana
chloroplast Type 1B PDF shows striking similarity with an overall RMSD of 0.645 Å (Figure 4b).
Likewise, the solved structure of the bacterial PDF shows an identical conformation (Figure 4b).
As expected from the kinetic data, the structures of all three enzymes are also identical when
bound to actinonin (data not shown), thus providing strong structural evidence that the phage
protein is a functional PDF enzyme. The C-terminal α-helix that is characteristic of Type 1B PDF
proteins is absent from the phage protein (Supplementary Figure S1). The amino-acid residues
that compose the active site, which are conserved and are predicted to interact with a zinc ion,
are observed in identical conformations in these three enzymes (Figure 4c). Comparison of the
amino-acid residues positioned at the entry to the active site in these three proteins also
shows compelling similarity (Figure 4d), with the notable exception that the tyrosine in both
the bacterial and chloroplast proteins is replaced by an asparagine in the phage PDF.

Figure 3 Comparison of the Synechococcus phage S-SSM7 PDF amino-acid sequence with that of other
PDFs. (a) Multiple sequence alignment of selected PDFs showing the three conserved motifs and
the C-terminal domain. Included are PDFs from: A. thaliana mitochondria; A. thaliana
chloroplasts; Synechococcus phage S-SSM7; Synechococcus elongatus PCC 6301; and P. falciparum
apicoplast. Highlighted regions are the PDF-specific G-motif (GΦGΦAAXQ), C-motif (EGCXS) and
H-motif (QHEXDHLXG), as well as the variable C-terminal domain (where Φ = any hydrophobic amino
acid and X = any amino acid). Degree of conservation of amino acids at each position:
* = absolutely conserved; : = different but very similar amino acids; . = different but somewhat
similar amino acids; blank = dissimilar amino acids or gaps. (b) A phylogenetic tree of 51 PDF
proteins from bacteria, phage and eukaryote organelles.

Figure 4 (a) Crystal structure of Synechococcus phage S-SSM7 peptide deformylase (PDF).
(b) Overlay of phage S-SSM7 PDF (green), Synechococcus elongatus PCC 6301 PDF (cyan) and
A. thaliana chloroplast PDF1B (magenta) shows striking similarity of protein folds, as well as
the position of the zinc ion in the active site. (c) Evaluation of the active site residues
around the zinc ion reveals strong conservation among these three PDFs: phage S-SSM7 (green),
Synechococcus (cyan) and chloroplast (magenta). (d) Comparison of the residues at the entry to
the active site in the phage (green), Synechococcus (cyan) and chloroplast (magenta) PDFs. Green
sticks = phage asparagine 99; cyan sticks = Synechococcus tyrosine 116; magenta
sticks = chloroplast tyrosine 178.
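
The RMSD quoted for the structural comparison in this section is a root-mean-square deviation over paired atom positions. The sketch below shows only that final calculation. It assumes the two coordinate lists are already superimposed and matched atom for atom, which in practice requires a separate alignment step that is not shown; the function name and the coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumption: the structures are already superimposed and the i-th atom in
    coords_a corresponds to the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in Å: every atom is displaced by 1 Å along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model_a, model_b))
```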

The catalytic zinc has been shown to coordinate the PDF inhibitor actinonin (Guilloteau et al.,
2002) and is responsible for the high affinity binding of this universal PDF inhibitor. The
structure of the phage PDF bound to actinonin was determined (Figures 5a and b) and shows the
interactions between bound actinonin and active site residues of the phage PDF.



Discussion 

The growing list of AMGs identified in phage 
genomes includes PDFs. The deformylation of the 
N-terminal formylmethionine catalyzed by PDFs is 
an essential step in the co-translational processing of 
nascent polypeptides in all bacteria and bacterial-derived
organelles (Giglione et al., 2004). Deformylation,
and often excision of the N-terminal methionine,
is necessary for further post-translational
processing and proper folding. Only two representatives
of this enzyme family had been previously
recognized in sequenced phage genomes, those being
in two phages of Vibrio parahaemolyticus (Seguritan
et al., 2003). However, virus-affiliated PDF genes
were recently found to be abundant in marine 
metagenomes (Sharon et al., 2011). All of these
sightings relied solely on homology; enzymatic
activity had not been demonstrated for any
phage-encoded PDF.

Figure 5 (a) Crystal structure of Synechococcus phage S-SSM7
PDF binding the inhibitor actinonin. (b) Interaction of actinonin
with the active site residues of the phage PDF showing polar
interactions. Residues from the C- and G-motifs are shown as
green sticks, actinonin as orange sticks.

We report here the identification of a gene
encoding a predicted PDF in the sequenced genome
of a cyanophage (S-SSM7) isolated from a marine
cyanobacterium (Synechococcus WH8109). For
comparison, the predicted bacterial PDF from the
genome of a similar host, the freshwater cyanobacterium
S. elongatus PCC 6301, was also characterized.
Both protein sequences display the three
absolutely conserved motifs (G, H and C) known to
form the PDF active site (Giglione et al., 2004), thus
suggesting that both might be catalytically active
(Figure 3a).

The argument can be made that having a PDF gene 
on board may be more necessary for a cyanophage 
than for phages infecting heterotrophs. A common 
theme of lytic infection by any phage is the shutting 
down of transcription and translation of host 
proteins, with the concomitant redirection of the 
host machinery to phage replication. In the case of 
cyanophages, this is complicated by the need to 
maintain photosynthesis to provide the energy 
(ATP) and reducing power (NADPH) required for 
efficient phage replication (Lindell et al., 2005).
Maintenance of photosynthesis, in turn, requires
ongoing synthesis of functional proteins, the synthesis
of D1 being particularly critical because its high
rate of photodamage requires continual replacement.
Numerous cyanophages encode PSII proteins
(D1 and at least one other) (Lindell et al., 2005). In
Prochlorococcus and Synechococcus, synthesis of
phage-encoded D1 supplements production of the
host protein (Lindell et al., 2005; Clokie et al., 2006).
Although it is likely that cyanophages use additional
tactics (for example, high-light-inducible
proteins) to maintain a functional PSII, synthesis
and co-translational processing of D1 is a priority.

PDFs remove the N-terminal formyl group from at
least 98% of all bacterial proteins (reviewed in
Meinnel and Giglione, 2008). Actinonin, an antimicrobial
produced by Streptomyces roseopallidus
(Giglione et al., 2004), is a universal PDF inhibitor.
When PDF activity in plant chloroplasts is inhibited
by actinonin in vivo, photosynthetic function
declines and an albino phenotype is typically
evident (Giglione et al., 2003; Hou et al., 2004),
demonstrating an essential role for PDFs in maintaining
photosynthesis. In vascular plants, actinonin
reduces D1 synthesis and PSII assembly, leading to
reduced PSII function (Hou et al., 2004). Notably,
the PDF1B in Arabidopsis chloroplasts has a higher
catalytic efficiency when deformylating D1 compared
with other proteins (Dirk et al., 2002).

The predicted PDF encoded by phage S-SSM7 was
expressed and was found to have a high level of
deformylase activity and sensitivity to actinonin, a
universal PDF inhibitor (Figure 1a). Its closest
homolog, the Type 1B PDF from A. thaliana chloroplasts
(Figure 2b), is more efficient at deformylating
D1 than other proteins (Dirk et al., 2002).
Likewise, catalytic efficiency (kcat/Km) of the phage
PDF is higher on the D1-derived tetrapeptide
substrates (Figure 2b). In contrast, the bacterial
PDF from S. elongatus PCC 6301 does not show
this preference (Table 2). That the differences in
efficiency were less when tetrapeptides from ribosomal
proteins served as the substrate suggests that
the greater efficiency of the phage PDF on the D1-derived
tetrapeptides cannot be attributed solely to
the higher activity (larger kcat) of the phage enzyme.
The factors responsible for this difference were not
apparent in the actinonin-bound enzyme structures
and their elucidation would presumably require
obtaining structures of the PDF enzymes bound to
substrate molecules. Attempts were made to
co-crystallize the enzymes with the substrates, and to
soak existing crystals with peptide substrates, but
substrate-bound structures could not be obtained
(data not shown).
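
The substrate-preference argument above can be illustrated with a minimal sketch. It assumes simple Michaelis-Menten kinetics, in which catalytic efficiency is kcat/Km, and compares the efficiency of one enzyme on a D1-derived peptide with its efficiency on a ribosomal peptide. The class and function names are hypothetical, and the numbers are placeholders chosen for easy arithmetic; they are not the measurements reported in Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Michaelis-Menten parameters for one enzyme on one substrate."""
    k_cat: float  # turnover number, s^-1
    k_m: float    # Michaelis constant, mM

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency kcat/Km, in mM^-1 s^-1."""
        return self.k_cat / self.k_m


def preference(on_d1: Kinetics, on_ribosomal: Kinetics) -> float:
    """Fold preference of a single enzyme for the D1 peptide over the ribosomal peptide."""
    return on_d1.efficiency / on_ribosomal.efficiency


# Hypothetical placeholder values, NOT the data from Table 2.
phage_d1 = Kinetics(k_cat=40.0, k_m=0.5)
phage_ribo = Kinetics(k_cat=20.0, k_m=1.0)
host_d1 = Kinetics(k_cat=10.0, k_m=1.0)
host_ribo = Kinetics(k_cat=10.0, k_m=1.0)

print(preference(phage_d1, phage_ribo))  # 4.0: the enzyme favors the D1 peptide
print(preference(host_d1, host_ribo))    # 1.0: no substrate preference
```

A ratio near 1 indicates no preference. If an enzyme were simply faster overall, its advantage over the other enzyme would be expected to be similar on both substrates; a ratio well above 1 for only one enzyme points to substrate-specific effects instead.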

These observations suggest that this cyanophage- 
encoded PDF assists in the maintenance of an active 
PSII during infection. Moreover, all 70 of the PDFs 
identified in marine metagenomes are from 
cyanophage genomes, thus indicating that carrying
a PDF gene is a tactic employed by many marine
cyanophages (personal communication; Sharon et al.,
2011). These phage-encoded PDF genes are also
widely distributed in the world's oceans (Figure 1).
Combined, these factors indicate that cyanophage
PDFs have a significant role in the global interplay
between cyanophages and their hosts. Comparison of
the solved structures of both the phage and bacterial 
PDFs to the Type 1B PDF from A. thaliana
chloroplasts demonstrated striking similarity of the 
protein folds, position of the catalytic zinc, and the 
amino-acid residues forming the cavity of the active 
site (Figure 4). The phage and chloroplast enzymes 
also show identical conformation while binding the 
inhibitor actinonin (data not shown). Phylogenetic 
analysis locates both the phage and the bacterial 
PDFs within the Type 1B group (Figure 3). As noted
above, the phage PDF and its closest homolog, the A.
thaliana chloroplast PDF (but not the bacterial
enzyme from S. elongatus), demonstrate preferential
specificity for D1-derived tetrapeptide substrates,
reflecting the importance of efficient
deformylation of this protein. The source of this
substrate preference, despite the close similarity of
their active sites, has been postulated to be due to
the tyrosine moiety at position 178 located at the
entry to the active site where it might sterically
hinder the entrance of non-D1-like N-terminal
polypeptides (Dirk et al., 2008). A tyrosine occupies
the same location in the bacterial PDF, but in the 
phage protein this tyrosine is replaced by an 
asparagine (N99) pointing in the opposite direction 
(Figure 4d). 

The PDF of Synechococcus phage S-SSM7 lacks the
C-terminal α-helical domain that is a defining
characteristic of Type 1B PDFs. There is considerable
variability in the length and secondary structure of
this domain in PDF1 (Giglione et al., 2009), and
the domain is not required for catalytic activity
(Meinnel et al., 1996). Nevertheless, it is striking
that the domain is lacking in all three PDFs 
identified in sequenced phage genomes to date 
(that is, Synechococcus phage S-SSM7, and the 
vibriophages VP16C and VP16T). It is also lacking 
from at least 52 of the 70 marine viral PDF sequences 
identified in the marine metagenomes (Sharon et al.,
2011; Beja, personal communication). It was
recently proposed, based on studies of the Type 1
PDF of E. coli, that Type 1 PDF proteins interact with
the ribosome via their helical C-terminal domain
(Bingel-Erlenmeyer et al., 2008; Giglione et al., 2009;
Kramer et al., 2009). According to this model, the
C-terminal domain assists in positioning the enzyme 
in close proximity to the translation ribosome exit 
tunnel, thereby providing efficient co-translational 
processing. In addition to supporting biochemical 
and structural evidence, experiments in vivo under 
PDF-limiting conditions found that truncation of the 
C-terminal helix reduced the bacterial growth rate 
(Bingel-Erlenmeyer et al., 2008).

However, the PDF of Synechococcus phage 
S-SSM7, without a C-terminal domain, has higher 
catalytic activity in vitro than the bacterial PDF from 
S. elongatus that possesses the domain (Table 2, 
Figure 2). It is not known whether the truncation 
might have contributed to the higher activity of the 
phage enzyme. Microbially-derived genes within 
phage genomes tend to be shorter than their 
microbial counterparts (Daubin and Ochman, 
2004), suggesting that perhaps these phages, for 
genomic economy, have eliminated a domain that is 
not essential during infection. Alternatively, it is 
possible that the truncation enables the phage PDF 
to compete more effectively with the host enzyme 
for access to the ribosome. 

Encoding AMGs gives phage the potential to 
modulate host metabolism and energy flow to favor 
their own replication. Countering environmental 
stressors and maintaining essential host systems 
appear to be the priorities. Increasingly, we are
aware that the well-studied example of the photosystem
genes (for example, psbA that encodes D1)
carried by cyanophages is but one of many sophisticated
mechanisms used to ensure their efficient
replication. Another example: carbon metabolism
genes carried by several cyanophages, including the 
Calvin cycle inhibitor protein CP12, serve to shuttle 
energy (ATP) and reducing power (NADPH) away 
from carbon fixation and into anabolic pathways 
(for example, dNTP synthesis), thereby facilitating 
phage replication (Lindell et al., 2005; Sharon et al.,
2009; Thompson et al., 2011). The cyanophage-encoded
PDF characterized here adds a novel
strategy contributing to the maintenance of photosynthetic
processes during infection and redirection
of their products toward phage replication. The
enzymatic and structural studies reported here also
demonstrate that the structure, efficiency and
specificity of acquired AMGs can be subsequently
fine-tuned to better serve the interests of the phage.
Further work is needed to monitor expression of 
both phage and host PDF during cyanophage 
infection and to measure the contribution of the 
phage PDF to co-translational protein processing. 
Also remaining to be elucidated is how the function 
and efficiency of phage PDFs are influenced by the 
absence of a C-terminal alpha helix. 